You will be able to
body_mass_g: species, island, bill_length_mm, bill_depth_mm, flipper_length_mm, sex
Training-Set | Validation-Set | Test-Set |
---|---|---|
for fitting the models | for selecting the models (e.g., which parameters to include) | held-out set of data, used to check that the selected model also performs well on unseen data |
data = [3, 7, 13, 15, 22, 25, 50, 91]
training = [3, 7, 13, 50]
validation = [15, 91]
test = [22, 25]
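
The split above can be reproduced in a few lines. This is a minimal sketch using only the standard library; the helper name `holdout_split`, its parameters, and the fixed seed are assumptions made for illustration and are not part of the original material. Which values end up in which set depends on the shuffle, so the result will generally differ from the hand-picked split shown above.

```python
import random


def holdout_split(data, n_validation, n_test, seed=0):
    """Shuffle a copy of the data and cut it into three disjoint parts.

    Hypothetical helper: name, parameters, and seed are illustrative only.
    """
    shuffled = list(data)                  # copy, so the input stays untouched
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is repeatable
    test = shuffled[:n_test]
    validation = shuffled[n_test:n_test + n_validation]
    training = shuffled[n_test + n_validation:]
    return training, validation, test


data = [3, 7, 13, 15, 22, 25, 50, 91]
training, validation, test = holdout_split(data, n_validation=2, n_test=2)
print("training:  ", training)
print("validation:", validation)
print("test:      ", test)
```

Every value lands in exactly one of the three sets, which is the property that matters: the test set is never seen while fitting or selecting the model.
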
data = [3, 7, 13, 15, 22, 25, 50, 91]
test = [22, 25]
training_1 = [7, 13, 15, 50, 91]
validation_1 = [3]
training_2 = [3, 13, 15, 50, 91]
validation_2 = [7]
...
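
The pattern above, where each remaining observation serves as the validation set exactly once, is leave-one-out cross-validation. A minimal sketch follows; the function name `leave_one_out` is an assumption for illustration.

```python
def leave_one_out(items):
    """Yield (training, validation) pairs, holding out one item at a time."""
    for i, held_out in enumerate(items):
        yield items[:i] + items[i + 1:], [held_out]


data = [3, 7, 13, 15, 22, 25, 50, 91]
test = [22, 25]                                  # set aside first, never reused
remaining = [x for x in data if x not in test]   # [3, 7, 13, 15, 50, 91]

splits = list(leave_one_out(remaining))
for training, validation in splits[:2]:
    print(training, validation)
# [7, 13, 15, 50, 91] [3]
# [3, 13, 15, 50, 91] [7]
```

With six remaining observations this produces six training/validation pairs, so every model candidate has to be fitted six times.
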
data = [3, 7, 13, 15, 22, 25, 50, 91]
test = [22, 25]
training_fold_1 = [13, 15, 50, 91]
validation_fold_1 = [3, 7]
training_fold_2 = [3, 7, 50, 91]
validation_fold_2 = [13, 15]
...
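
The folds above can be generated programmatically. The sketch below assumes contiguous folds of equal size and a hypothetical helper name `k_fold`; in practice the data is usually shuffled before it is cut into folds.

```python
def k_fold(items, k):
    """Yield (training, validation) pairs for k contiguous folds.

    Assumes len(items) is divisible by k, as in the example above.
    """
    fold_size = len(items) // k
    for i in range(k):
        start, stop = i * fold_size, (i + 1) * fold_size
        yield items[:start] + items[stop:], items[start:stop]


remaining = [3, 7, 13, 15, 50, 91]   # data without the test set [22, 25]
folds = list(k_fold(remaining, k=3))
for training, validation in folds:
    print(training, validation)
# [13, 15, 50, 91] [3, 7]
# [3, 7, 50, 91] [13, 15]
# [3, 7, 13, 15] [50, 91]
```

Compared with holding out one observation at a time, this needs only `k` model fits, and each observation is still used for validation exactly once.
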
Training-Set | Validation-Set | Test-Set |
---|---|---|
used to train the models | used to select the model, predictors and/or parameters | used to assess the model's performance |
5-fold CV (with Validation) | 5-fold CV (with Training) | cut out at the beginning |
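
The roles in the table can be sketched end to end on the toy data. Everything here is an illustrative assumption rather than the course's own code: the two candidate "models" are simply constant predictors (mean and median of the training values), the error measure is the mean squared error, and 3 folds are used instead of 5 because only six observations remain after removing the test set.

```python
from statistics import mean, median


def k_fold(items, k):
    """Yield (training, validation) pairs for k contiguous, equal-sized folds."""
    fold_size = len(items) // k
    for i in range(k):
        start, stop = i * fold_size, (i + 1) * fold_size
        yield items[:start] + items[stop:], items[start:stop]


def cv_error(fit, items, k):
    """Average validation MSE of a constant predictor over k folds."""
    fold_errors = []
    for training, validation in k_fold(items, k):
        prediction = fit(training)  # "train" on the training part only
        fold_errors.append(mean([(y - prediction) ** 2 for y in validation]))
    return mean(fold_errors)


test = [22, 25]                      # cut out at the beginning
remaining = [3, 7, 13, 15, 50, 91]   # used for training and validation via CV

# Hypothetical candidates: each maps training values to one predicted value.
candidates = {"mean": mean, "median": median}

# Model selection uses the cross-validated error only.
scores = {name: cv_error(fit, remaining, k=3) for name, fit in candidates.items()}
best = min(scores, key=scores.get)

# The selected model is refitted on all remaining data and scored once on the test set.
final_prediction = candidates[best](remaining)
test_error = mean([(y - final_prediction) ** 2 for y in test])
print(scores, best, test_error)
```

The test set enters only in the last step, after the choice between candidates has been made, so its error is not biased by the selection.
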
Try to find the best possible linear model for predicting the penguin weight using
5.1 Cross-Validation and Model Selection - Resampling
40 minutes
Now, you can